Text File | 1989-09-20 | 43.7 KB | 1,686 lines |
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- REGULAR EXPRESSION DLL
-
-
-
-
-
-
-
-
- A Dynamic Link Library
-
- for Microsoft Windows
-
- by Windfall Software Systems
-
- 40 Windfall Lane
-
- Marlboro, NJ 07746
-
-
-
- CompuServe ID: 71330,3614
-
-
-
-
- Copyright Windfall Software Systems, 1989
-
- All Rights Reserved
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- A. Software License 1
-
- B. Concepts and Facilities 2
-
- 1. Package Contents 2
- 2. Regular Expressions 3
- 3. Sample Regular Expressions 5
-
- C. Functions Overview 8
-
- D. Applications 10
-
- E. Demonstration Program 13
-
- F. Function Reference 14
-
- 1. RxMatch - Match a Regular Expression 15
- 2. RxExtract - Extract a Matching Group 17
- 3. RxReplace - Replace Placeholders 19
- 4. RxMsgText - Build Error Message 21
-
- G. Registration Form 23
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- A. SOFTWARE LICENSE
-
-
-
- You are granted a limited licence to use the Regular
- Expression DLL on a private, non-commercial basis and to make
- copies of this package and distribute them to other users,
- under the following conditions:
-
- ■ This package must be copied and/or distributed in
- unmodified form, complete with the file containing this
- licence information.
-
- ■ No fees or other compensation may be requested or
- accepted by any licensee, except that clubs and user
- groups may charge a nominal fee not to exceed $10 for
- expenses and handling.
-
- ■ No part of the software contained in this package can be
- distributed with any other product or service.
-
- If you want to use this software in a different way, you can
- use this package for evaluation only. If it fits your
- requirements, you can obtain a typical nonexclusive licence to
- use the software and related documentation, on a single
- computer at a time, and distribute derivative works. To obtain
- this licence, mail the Registration Form (last page) and a
- registration fee of $10 to the address shown on the form. The
- licence will be mailed to you. You can also use this form to
- order the complete source code of this package and its
- internal documentation.
-
-
-
- You, the Customer, assume all responsibility for the
- selection of this package as appropriate to achieve the
- results intended by the Customer. The software of the Regular
- Expression DLL is provided "as is" without warranty of any
- kind, either expressed or implied, including, but not limited
- to the implied warranties of merchantability and fitness for a
- particular purpose. In no event will Windfall Software Systems
- be liable for any damages arising out of the Customer's use of
- the software, including loss of data or profits, loss of use
- or other economic loss, or indirect, incidental, consequential
- or special damages of any kind, even if Windfall Software
- Systems has been advised of the possibility of the same. In no
- event shall Windfall Software Systems' liability for any
- damages exceed the price paid for the license to use the
- software, regardless of the form of the claim.
-
-
-
-
-
-
-
- 1
-
-
-
-
-
-
-
-
-
- B. CONCEPTS AND FACILITIES
-
-
-
- This package is a dynamic link library (DLL) that performs
- regular expression searches and related operations for
- Windows applications.
-
- A regular expression is a string that defines a pattern of
- text by using certain special characters. Those special
- characters let you specify optional choices, repetitions and
- character classes in such a way that a given regular
- expression matches not one string but all strings having some
- selected properties.
-
- Regular expressions are often supported by text editors.
- Most likely, the one you use provides a generalized search
- command that uses some form of regular expressions to define
- text patterns.
-
- Although programmers use regular expressions while editing
- programs and working with other programming tools, they
- rarely use regular expression routines in the applications
- they write. Yet, as the examples in this document try to
- demonstrate, regular expression routines can simplify many
- typical operations.
-
-
- 1. Package Contents
-
- The following files make up the evaluation version of the
- package:
-
- ■ WSSRX1.TXT Documentation (this file).
-
- ■ WSSRX1.EXE Dynamic link library.
-
- ■ WSSRX1.LIB Import library for WSSRX1.EXE.
-
- ■ WSSRX.H Interface definitions/declarations.
-
- ■ RXTEST.EXE Demonstration program.
-
- The header file (WSSRX.H) and the import library (WSSRX1.LIB)
- are necessary to compile and link programs that use the
- dynamic link library. Copy the header file to the directory
- named by the INCLUDE environment variable and the import
- library to the directory named by the LIB environment
- variable. This is the simplest setup. If you
- choose different directories, you will have to adjust the
-
-
-
-
-
-
- 2
-
-
-
-
-
-
-
-
- #include directives and the linker parameters. The library
- itself (WSSRX1.EXE) is the only component needed by a compiled
- program.
-
-
- 2. Regular Expressions
-
- Regular expressions describe more or less complex text
- patterns. A simple pattern is merely a character, such as x,
- or a string of characters taken literally, such as ABC.
- Regular expressions like that represent specific strings,
- character for character. More complex patterns use special
- characters to describe not one individual string but a whole
- set of strings.
-
- The following items define elementary regular expressions
- matching a single character:
-
- ■ An ordinary character matches itself. An ordinary
- character is a character other than one of the following
- special characters:
-
- \ ^ $ . [ | { } * + ?
-
- ■ A backslash (\) followed by a character matches that
- character, even if the character alone is special. Note
- that the C language uses this character in a similar
- fashion, so to specify a backslash in a C string
- constant, you have to use it twice, e.g. "\\{".
-
- ■ A caret (^) matches itself except when it appears at the
- beginning of the entire regular expression. The meaning
- of this character at the beginning is defined later.
-
- ■ A dollar sign ($) matches itself except when it appears
- at the end of the entire regular expression. The meaning
- of this character at the end is defined later.
-
- ■ A period (.) matches any character.
-
- ■ A non-empty string enclosed in square brackets ([]) is
- called a character class. It matches any one character in
- that string. If, however, the first character of the
- string is a caret (^), the character class matches any
- character except the characters in the string. The caret
- represents itself when it appears somewhere else in the
- string. The minus sign (-) may be used to represent a
- range of consecutive characters. For example, a-z
- represents any lowercase letter. The minus represents
- itself when it appears first (possibly after an initial
- caret) or last in the string. The right square bracket
-
-
-
-
-
-
- 3
-
-
-
-
-
-
-
-
- stands for itself if it is the first character within the
- string (after an initial caret, if any). All other
- characters defined above as special represent themselves
- in a character class (e.g. [\.] means "backslash or
- period").
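-
- The character class rules can be made concrete with a short C
- sketch. This is not the library's code: the function name
- class_has and its arguments are assumptions made for
- illustration only. The caller passes the text found between
- the square brackets, and the function reports whether a given
- character belongs to the class. A reversed range such as z-a
- simply matches nothing here; how the library treats it is not
- covered by this sketch.
-
```c
#include <stddef.h>

/* Hypothetical helper: cls is the text between [ and ], len its length.
   Returns 1 if c belongs to the class, 0 otherwise. */
int class_has(const char *cls, size_t len, char c)
{
    unsigned char uc = (unsigned char)c;
    size_t i = 0;
    int negate = 0;
    int found = 0;

    if (len > 0 && cls[0] == '^') {     /* a leading caret negates the class */
        negate = 1;
        i = 1;
    }
    while (i < len) {
        unsigned char lo = (unsigned char)cls[i];

        /* x-y is a range only when the minus is not the last character;
           a minus in first position is consumed here as a literal "lo" */
        if (i + 2 < len && cls[i + 1] == '-') {
            unsigned char hi = (unsigned char)cls[i + 2];
            if (lo <= uc && uc <= hi)
                found = 1;
            i += 3;
        } else {
            if (lo == uc)               /* every other character is literal */
                found = 1;
            i += 1;
        }
    }
    return negate ? !found : found;
}
```
-
- For instance, class_has("^0-9", 4, 'x') returns 1, and
- class_has("+-", 2, '-') returns 1 because the minus sign
- stands last in the string.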
-
-
-
- The following rules can be used recursively to construct more
- complicated regular expressions:
-
- ■ An elementary regular expression is a regular expression
- matching a single character as described earlier.
-
- ■ A concatenation of regular expressions is a regular
- expression that matches the concatenation of strings
- matched by each component of the concatenation.
-
- ■ An alternative, i.e. two regular expressions separated by
- a vertical bar (|), is a regular expression that matches a
- string matched by at least one of the components. If both
- components match, the preference is given to the left
- one.
-
- ■ A group, i.e. a regular expression enclosed in braces
- ({}), is a regular expression that matches the same
- string as the enclosed expression.
-
- ■ An iteration, i.e. an elementary regular expression or a
- group followed by an asterisk (*), is a regular
- expression that matches zero or more occurrences of the
- string matched by the expression preceding the asterisk.
- If there is any choice, the longest leftmost string that
- facilitates a match is chosen.
-
- ■ A non-empty iteration, i.e. an elementary regular
- expression or a group followed by a plus (+), is a
- regular expression that matches one or more occurrences
- of the string matched by the expression preceding the
- plus. If there is any choice, the longest leftmost string
- that facilitates a match is chosen.
-
- ■ An option, i.e. an elementary regular expression or a
- group followed by a question mark (?), is a regular
- expression that matches zero or one occurrence of the
- string matched by the expression preceding the question
- mark. If there is a choice, the match with one occurrence
- is chosen.
-
- ■ A caret (^) at the beginning of a regular expression
- constrains the match to an initial segment of a string.
-
-
-
-
-
-
- 4
-
-
-
-
-
-
-
-
- ■ A dollar sign ($) at the end of a regular expression
- constrains the match to a final segment of a string.
-
- The above rules introduce some special characters that behave
- like operators and establish the precedence criteria for them.
- For example, because iterations bind single characters or
- groups, a|b+ matches a single a or one or more b's. To match
- a string of one or more characters, each of them an a or a b,
- we would have to use {a|b}+. Note that the caret and the
- dollar sign differ somewhat from the other operators: they
- have no special meaning unless they stand at the beginning
- or the end of the entire regular expression.
-
-
- 3. Sample Regular Expressions
-
- Each of the examples below consists of three parts:
-
- regular expression
-
- one or more strings to be matched against that expression
-
- explanation
-
- The matching part of each sample string is shown to the right
- of the string.
-
-
-
- [a-zA-Z0-9]
-
- (((ABC))) A
-
- Matches a single letter or digit.
-
-
-
- [a-zA-Z0-9]+
-
- (((ABC))) ABC
-
- 201-777-1212 201
-
- Matches a "word", i.e. a string of letters and/or digits
- delimited by something else.
-
-
-
- [a-zA-Z][a-zA-Z0-9]*
-
- (32, -x2) x2
-
-
-
-
-
-
-
- 5
-
-
-
-
-
-
-
-
- Matches an ALGOL-like identifier, i.e. a string of letters
- and/or digits starting with a letter.
-
-
-
- Y{AB|CD}Z
-
- XXYCDZXX YCDZ
-
- Matches YABZ or YCDZ.
-
-
-
- [a-zA-Z]*{ie|ei}[a-zA-Z]*
-
- selected properteis properteis
-
- Matches words, i.e. strings of letters, that contain ie or ei
- (and are often misspelled).
-
-
-
- [0-9]+{\.[0-9]+}?{E{+|-}?[0-9]+}?
-
- - 12.79; 12.79
-
- a = 3.14E-2, 3.14E-2
-
- Matches a decimal number with an optional fraction and an
- optional exponent. As you will see later, the groups
- surrounded by braces not only establish scopes for the ?
- operators but also can be used to extract parts of the matched
- number.
-
-
-
- [a-z]*[,.?!]?
-
- xxx, yyyy. xxx,
-
- 1234567890 match on a null string
-
- Matches a possibly empty string of letters followed by an
- optional delimiter? Yes, but most likely this is not what you
- want. This pattern will show a match with any string, because
- all of its parts are optional. It may match something
- non-trivial if the test string starts with the right
- combination of characters. Otherwise, it will match the empty
- string that exists at the beginning of any string. All strings
- start with "a possibly empty string of letters followed by an
- optional delimiter". Be careful with the ? and * operators. In
-
-
-
-
-
-
- 6
-
-
-
-
-
-
-
-
- most cases, they have to be used within some non-empty context
- to yield good results. Sometimes, you can use them in the way
- presented here but you should check if the matching substring
- is non-empty. In both cases shown above, the library indicates
- a match. To find out that the match is non-trivial, you have
- to check if the size of the matching string is nonzero.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 7
-
-
-
-
-
-
-
-
-
- C. FUNCTIONS OVERVIEW
-
-
-
- The primary function in this library is the RxMatch function.
- This function searches a given string looking for a substring
- that matches a given regular expression. This operation
- consists of two steps. First, the function parses the regular
- expression and converts it into an internal form which
- facilitates faster matching. This step may fail if the
- function discovers a syntax error in the regular expression.
- If the parsing is successful, the next step tries to match the
- internal form of the expression with the given string. If
- there is a matching substring, the function returns a non-zero
- value. Otherwise, it returns zero.
-
- A call to the RxMatch function always supplies a regular
- expression and a string to match. However, to avoid repeated
- parsing of the same regular expression, the library provides a
- caching facility. The cache holds a number of recently used
- regular expressions, together with their translations. If the
- expression supplied in the current call is the same as one in
- the cache, the parsing is bypassed. The cache capacity depends
- on the size of the regular expressions held in it and their
- complexity. For typical applications, you can assume that the
- few most recently used expressions will be found in the cache.
-
- The RxMatch function can respond with more information
- than just a boolean return value. Its first parameter points
- to an area (provided by the calling program) where the
- function places the additional information. This area, called
- the feedback array, consists of 1-16 elements of the following
- type:
-
-     typedef struct
-     {
-         int pos;
-         int size;
-     } RX;
-
- When an RxMatch call is unsuccessful because no match exists,
- the feedback array is cleared to zeros. However, if the call
- failed due to some formal errors in the regular expression, a
- nonzero error code is inserted into the first word of the
- array (rx[0].pos). The error code can be converted into a text
- message by the RxMsgText function.
-
- A successful call to RxMatch places the relative position
- and the size of the matched substring in the pos and size
-
-
-
-
-
-
- 8
-
-
-
-
-
-
-
-
- fields of the first feedback array element. The remaining
- elements receive the positions and sizes of the substrings
- that match groups in the regular expression. For example, when
- we match:
-
- {[0-9]*}A+{[0-9]*} with XXX5678AAA876...
-
- the feedback array will have the following elements:
-
- 3; 10 - position and size of the match (5678AAA876)
- 3; 4 - first group (5678)
- 10; 3 - second group (876).
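-
- The three pairs listed above are plain offsets into the
- searched string, so they can be checked without the library.
- The sketch below is an illustration only: copy_span is a
- hypothetical helper, not a library function, and example_rx
- simply repeats the values quoted in the text rather than
- being produced by a real RxMatch call.
-
```c
#include <string.h>

/* Same layout as the RX type shown earlier in this chapter. */
typedef struct { int pos; int size; } RX;

/* Feedback values quoted in the text for {[0-9]*}A+{[0-9]*}
   matched with XXX5678AAA876... (written out by hand). */
static const RX example_rx[3] = { {3, 10}, {3, 4}, {10, 3} };

/* Hypothetical helper: copy the span described by one feedback element
   into dst, keeping at most dstSize-1 characters plus the null. */
char *copy_span(const RX *e, const char *text, char *dst, int dstSize)
{
    int n = e->size;

    if (dstSize <= 0)
        return dst;                 /* no room at all: leave dst untouched */
    if (n > dstSize - 1)
        n = dstSize - 1;            /* truncate to fit the buffer */
    memcpy(dst, text + e->pos, (size_t)n);
    dst[n] = '\0';
    return dst;
}
```
-
- Applied to the string XXX5678AAA876..., the three elements
- yield 5678AAA876, 5678 and 876 respectively.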
-
- The feedback array, filled up by a call to the RxMatch
- function, can be used by subsequent calls to the RxExtract and
- RxReplace functions. The RxExtract function copies one of the
- substrings identified by the match into another field. It can
- extract the whole match or a group match. The RxReplace
- function operates a little bit like the sprintf function. It
- replaces placeholders in a given string with the matching
- substring and/or the group matches.
-
- Naming conventions
-
- In variables and parameters, we use the prefix lsz to denote
- a long pointer to a zero-terminated character string and the
- prefix rx for a feedback array (RX []). Otherwise, we follow
- the conventions from the Windows SDK.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 9
-
-
-
-
-
-
-
-
-
- D. APPLICATIONS
-
-
-
-
- 1. Elimination of trailing white-space
-
- Truncate a given null-terminated character string lszText,
- eliminating all trailing white-space characters.
-
-     RX rx[1];
-
-     if (RxMatch(rx,1,"[ \t\n\v\f\r]+$",lszText))
-         lszText[rx[0].pos] = '\0';
-
- Note that there is a blank character at the beginning of the
- character class. The escape sequences define respectively:
- tab, new line, vertical tab, form feed and carriage return.
- They have nothing to do with the escape sequences used by
- regular expressions. During compilation, the compiler changes
- them into the corresponding ASCII codes. When RxMatch is
- called, the character class will contain six characters:
- blank, tab, new line, etc. We use the $ at the end to restrict
- the eventual match to the end of the tested string.
-
-
- 2. Reduction of white-space
-
- In a given, null-terminated, character string lszText, replace
- each sequence of white-space characters with a single space.
-
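-     LPSTR lpx;
-     RX rx[1];
-
-     lpx = lszText;
-     while (RxMatch(rx,1,"[ \t\n\v\f\r][ \t\n\v\f\r]+",lpx))
-     {
-         lpx += rx[0].pos;      /* start of the white-space run */
-         *lpx++ = ' ';          /* keep one space */
-         /* drop the remaining size-1 characters of the run */
-         lstrcpy(lpx,lpx + rx[0].size - 1);
-     }
-
- In this example the regular expression matches two or more
- white-space characters, so a lone white-space character (e.g.
- a single tab) is left as it is. The feedback array is used to
- gain access to the string and shrink it at the match point.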
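-
- The shrinking loop can be tried without the DLL. The sketch
- below is an assumption-laden illustration: find_ws_run is a
- hypothetical stand-in for the RxMatch call with the pattern
- shown above, and memmove replaces lstrcpy because the source
- and destination overlap. Only the pointer arithmetic on the
- pos and size fields is the point of the example.
-
```c
#include <string.h>

typedef struct { int pos; int size; } RX;

static int is_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical stand-in for RxMatch with the pattern used above: reports
   the leftmost run of two or more white-space characters in s. */
static int find_ws_run(RX *rx, const char *s)
{
    int i;

    for (i = 0; s[i] != '\0'; i++) {
        if (is_ws(s[i]) && is_ws(s[i + 1])) {
            int n = 2;
            while (is_ws(s[i + n]))     /* extend to the longest run */
                n++;
            rx->pos = i;
            rx->size = n;
            return 1;
        }
    }
    rx->pos = 0;                        /* no match: clear the feedback */
    rx->size = 0;
    return 0;
}

/* Collapse every run of two or more white-space characters to one space. */
void squeeze_ws(char *text)
{
    RX rx[1];
    char *p = text;

    while (find_ws_run(rx, p)) {
        char *rest;

        p += rx[0].pos;                 /* start of the run */
        rest = p + rx[0].size;          /* first character after the run */
        *p++ = ' ';                     /* keep exactly one space */
        memmove(p, rest, strlen(rest) + 1);  /* overlap-safe, copies the null */
    }
}
```
-
- Searching resumes just after the inserted space, so each run
- is visited once.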
-
-
-
-
-
-
-
-
-
-
-
- 10
-
-
-
-
-
-
-
-
- 3. Parsing of a filename
-
- Assuming that a string lpFile contains a DOS file name, divide
- it into components: drive specification (X:), path (A\B\C\)
- and name proper. All of the components are optional.
-
-     RX rx[4];
-
-     if (RxMatch(rx,4,"{.:}?{.*}{[^\\\\]+} *",lpFile))
-     {
-         RxExtract(rx,1,lpFile,szDrive,sizeof(szDrive));
-         RxExtract(rx,2,lpFile,szPath,sizeof(szPath));
-         RxExtract(rx,3,lpFile,szName,sizeof(szName));
-     }
-
- The first group matches an optional drive designation. The
- last group matches whatever follows the last backslash or, if
- there is no backslash, anything that follows the drive
- designation. The middle group matches everything in between
- (i.e. strings like xxx\yyy\). The iteration at the end removes
- trailing spaces from the third group. Note that this is not a
- test that lpFile contains a valid file name. The RxExtract
- calls extract the group matches and place them in szDrive,
- szPath and szName.
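-
- For comparison, the split described for the three groups can
- be written by hand in plain C. This is a different technique
- (no regular expression is involved), shown only to make clear
- which part of the name each group is meant to capture. The
- function split_dos_name is hypothetical, and it assumes that
- each output buffer is large enough to hold the whole input.
-
```c
#include <string.h>

/* Hypothetical helper: split a DOS file name into drive ("X:"), path
   (up to and including the last backslash) and the name proper.
   Every output buffer must be able to hold the whole input string. */
void split_dos_name(const char *file, char *drive, char *path, char *name)
{
    const char *p = file;
    const char *slash;

    drive[0] = '\0';
    path[0] = '\0';
    if (p[0] != '\0' && p[1] == ':') {      /* optional drive, e.g. C: */
        drive[0] = p[0];
        drive[1] = ':';
        drive[2] = '\0';
        p += 2;
    }
    slash = strrchr(p, '\\');               /* last backslash, if any */
    if (slash != NULL) {
        size_t n = (size_t)(slash - p) + 1; /* the path keeps its final backslash */
        memcpy(path, p, n);
        path[n] = '\0';
        p = slash + 1;
    }
    strcpy(name, p);                        /* whatever follows is the name */
}
```
-
- Unlike the regular expression, this sketch does nothing about
- trailing spaces and accepts an empty name.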
-
-
- 4. Validation of input
-
- Check if a given input field szIn contains a valid hexadecimal
- number and extract its components for further processing.
-
-     char szHex[10];
-     RX rx[3];
-     int value;
-
-     if (RxMatch(rx,3,"^ *{[+-]?} *{[0-9a-fA-F]+} *$",szIn))
-     {
-         RxExtract(rx,2,szIn,szHex,sizeof(szHex));
-         /* ... convert szHex to value ... */
-         if (rx[1].size && szIn[rx[1].pos] == '-')
-             value = -value;
-     }
-     else
-     {
-         /* ... error - invalid input ... */
-     }
-
- Here we anchored the match with ^ and $ so that the whole
- string is checked. Any extraneous characters (e.g. "-5b x" or
- "+ -25") will cause a mismatch. Spaces are accepted on both
- ends and after an optional sign. The feedback array is used to
- check if the number is negative.
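-
- The conversion step left open in the example could look like
- the sketch below. The helper hex_to_long is hypothetical and
- relies on the match having already guaranteed that the string
- holds nothing but hexadecimal digits; it does not check for
- overflow. The standard C call strtol(szHex, NULL, 16) would
- serve equally well.
-
```c
/* Hypothetical helper: convert a string of hexadecimal digits, such as
   the one extracted into szHex, to a long. Assumes that every character
   is a valid hex digit; overflow is not checked. */
long hex_to_long(const char *hex)
{
    long value = 0;

    for (; *hex != '\0'; hex++) {
        int d;

        if (*hex >= '0' && *hex <= '9')
            d = *hex - '0';
        else if (*hex >= 'a' && *hex <= 'f')
            d = *hex - 'a' + 10;
        else
            d = *hex - 'A' + 10;        /* remaining case: 'A'..'F' */
        value = value * 16 + d;         /* shift in one more hex digit */
    }
    return value;
}
```
-
- The sign is then applied separately, as in the example, by
- looking at the first group in the feedback array.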
-
-
-
-
-
-
-
-
-
- 11
-
-
-
-
-
-
-
-
- 5. Format change
-
- Check if a given input field szIn contains two decimal numbers
- separated with a comma and/or spaces. If it does, transfer
- them into another field with the following format:
-
- Length=x, Width=y
-
- where x and y are the numbers extracted from szIn.
-
-     char szAux[50]; /* The output field */
-     RX rx[4];
-
-     if (RxMatch(rx,4,"^ *{[0-9]+} *{,| } *{[0-9]+} *$",szIn))
-     {
-         strcpy(szAux,"Length=%1, Width=%3");
-         RxReplace(rx,4,szIn,szAux,sizeof(szAux));
-     }
-     else
-     {
-         /* ... error - invalid input ... */
-     }
-
- In this example we use two groups (the first and the third)
- to access the data. An additional group (i.e. {,| }) is used
- to override the usual interpretation of the regular
- expression operators. The alternative "comma or space" has to
- be enclosed in braces. Without them the scope of the |
- operator would be too large:
- " *{[0-9]+} *," OR " *{[0-9]+} *". The RxReplace function is
- used to replace the two placeholders (%1, %3) with the
- substrings matching the first and the third group.
-
-
- 6. Parsing of a telephone number
-
- Search a given string szText for a substring resembling a
- telephone number.
-
-     static char szRex[] = "{(?[2-9][0-9][0-9])?}? *-? *"
-                           "{[2-9][0-9][0-9]} *-? *"
-                           "{[0-9][0-9][0-9][0-9]}";
-     RX rx[4];
-
-     if (RxMatch(rx,4,szRex,szText))
-     {
-         RxExtract(rx,1,szText,lszArea,6);
-         RxExtract(rx,2,szText,lszExch,4);
-         RxExtract(rx,3,szText,lszNmbr,5);
-     }
-
- When the match is successful, we transfer the area code, the
- exchange and the final four digits to the fields pointed to
- by lszArea, lszExch and lszNmbr.
-
-
-
-
-
-
- 12
-
-
-
-
-
-
-
-
-
- E. DEMONSTRATION PROGRAM
-
-
-
- This simple Windows application (RXTEST.EXE) can be used to
- learn how to compose regular expressions and what to expect
- from the functions provided by the library.
-
- When you first start RXTEST, it displays a dialog box with
- a number of empty text fields and one button (Execute). You
- can enter text into the top three fields. The remaining fields
- are used by the program to display results. The description of
- all the fields follows.
-
- Pattern
-
- A regular expression.
-
- Search area
-
- Any text to be searched for a substring matching the regular
- expression entered in the Pattern field.
-
- Replacement area
-
- Text to be passed to the RxReplace function.
-
- Return code
-
- The return value from the RxMatch function followed by an
- error message.
-
- Replacement result
-
- Text produced by the RxReplace function from the text in the
- replacement area field.
-
- 0: 1: 2: 3: 4: 5: 6: 7:
-
- Texts extracted from the search area field by the RxExtract
- calls referencing a group with the given number. The first
- group (number 0) is defined as the full match. The subsequent
- groups are defined by the braces ({}) in the regular
- expression.
-
- Use the TAB/BACKTAB keys to move between the three input
- fields. Click the EXECUTE button or press the ENTER key to
- perform the match and related operations. Using different
- combinations of the input values, you can play with all the
- functions in the library.
-
-
-
-
-
-
- 13
-
-
-
-
-
-
-
-
-
- F. FUNCTION REFERENCE
-
-
-
- This chapter contains a list of functions from the Regular
- Expression DLL. The documentation for each function is
- organized in a way similar to that used in the Windows SDK.
-
- All function prototypes and other related declarations are
- contained in the header file wssrx.h. This header should be
- included (after windows.h) into all program files that refer
- to the functions in the DLL. The actual format of the include
- directive depends on the placement of the file. For example:
-
-     #include "wssrx.h"        current directory,
-
-     #include <wssrx.h>        a directory defined by the
-                               INCLUDE environment variable,
-
-     #include <sub\wssrx.h>    a subdirectory SUB of the above
-                               directory.
-
- Usually, you make direct calls to the library functions and
- use the import library (wssrx1.lib) when linking the program.
- In this case Windows will perform dynamic linking when the
- program is first loaded into memory. You can defer dynamic
- linking using the following technique:
-
-     HANDLE hRx;            /* Library handle */
-     FARPROC lpfnRxMatch;   /* To the RxMatch function */
-
-     hRx = LoadLibrary("WSSRX1.EXE");
-     lpfnRxMatch = GetProcAddress(hRx,OV_RXMATCH);
-     /* ... */
-     if (lpfnRxMatch(rx,3,"[0-9]+",szInput))
-     {
-         /* ... */
-     }
-     /* ... */
-     FreeLibrary(hRx);
-
- The header file wssrx.h contains definitions of four symbols
- (OV_xxxxx) that can be used with the GetProcAddress call to
- retrieve the addresses of the respective Rx functions.
-
-
-
-
-
-
-
-
-
-
-
-
- 14
-
-
-
-
-
-
-
-
- 1. RxMatch - Match a Regular Expression
-
-
-
-
- BOOL RxMatch(rx, nLim, lszRex, lszTxt)
-
- This function searches the lszTxt string for a substring that
- matches the regular expression given by the lszRex string. The
- result of the search is reflected by the return value and by
- the values set in the feedback array defined by the rx and
- nLim parameters.
-
- Parameter   Type/Description
-
- rx          RX FAR []  Specifies an area to be used as
-             the feedback array. If rx is NULL, no
-             feedback information is returned.
-
- nLim        int  Specifies the number of elements in
-             the feedback array rx. If nLim is zero or
-             negative, no feedback information is
-             returned.
-
- lszRex      LPSTR  Points to a null-terminated string
-             that specifies the regular expression.
-
- lszTxt      LPSTR  Points to a null-terminated string
-             that is to be matched with the regular
-             expression.
-
- a) Return Value
-
- The return value specifies the outcome of the function. It is
- non-zero if the match was successful. Otherwise, it is zero.
-
- b) Comments
-
- The return value of zero means that either there was no match
- with the regular expression or the parameters received by the
- function were invalid. These two cases can be distinguished
- only when the feedback array is non-empty (i.e. when rx is not
- NULL and nLim is greater than 0). When the parameters received
- by the function are in error, the function places a non-zero
- error code in the first word of the feedback array. When
- everything is valid and only the match is unsuccessful, the
- function clears that word.
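-
- The distinction drawn above can be summed up in a few lines
- of C. The sketch is illustrative only: the enumeration and
- the function classify are hypothetical names, and matched
- stands for the value returned by RxMatch.
-
```c
#include <stddef.h>

typedef struct { int pos; int size; } RX;

/* Hypothetical names for the outcomes described in the text. */
enum rx_outcome { RX_MATCHED, RX_NO_MATCH, RX_ERROR, RX_UNKNOWN };

/* matched is the RxMatch return value; rx and nLim are the feedback
   array and its size as they were passed to that call. */
enum rx_outcome classify(int matched, const RX *rx, int nLim)
{
    if (matched)
        return RX_MATCHED;
    if (rx == NULL || nLim <= 0)
        return RX_UNKNOWN;      /* no feedback array: the cases cannot be told apart */
    /* the first word holds a non-zero error code, or zero for "no match" */
    return rx[0].pos != 0 ? RX_ERROR : RX_NO_MATCH;
}
```
-
- Only in the RX_ERROR case is there an error code in rx[0].pos
- for RxMsgText to convert.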
-
- All possible values of the error codes are defined in the
- wssrx.h header file as ERR_xxxx manifest constants. The
-
-
-
-
-
-
-
- 15
-
-
-
-
-
-
-
-
- RxMsgText function can be used to convert an error code to a
- text message.
-
- The feedback array does not have to be initialized in any
- way before the call to RxMatch. Only the first 16 entries of
- the array are effectively used by the functions in this
- library.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 16
-
-
-
-
-
-
-
-
- 2. RxExtract - Extract a Matching Group
-
-
-
-
- LPSTR RxExtract(rx, nN, lszTxt, lszDst, nSize)
-
- This function assumes that the rx and lszTxt parameters have
- been used as arguments in a successful RxMatch call. The
- function copies a substring of lszTxt (adding the terminating
- null character), to the destination area specified by lszDst.
- The substring extracted by the function is defined as follows:
-
- ■ If nN is zero, it is the substring matching the whole
- regular expression in the RxMatch call.
-
- ■ If nN is greater than zero, it is the substring matching
- the nN-th group of the regular expression. If no group in
- the regular expression corresponds to nN, the substring
- is empty.
-
- If the extracted substring (including the terminating null) is
- longer than the destination length (nSize), it is truncated to
- fit in the destination area.
-
- Parameter   Type/Description
-
- rx          RX FAR []  Specifies an area used as a
-             feedback array in a successful RxMatch
-             call.
-
- nN          int  Specifies a group number in the
-             regular expression used by the RxMatch
-             call. This value should not exceed the
-             value of the nLim argument passed to
-             RxMatch.
-
- lszTxt      LPSTR  Points to a null-terminated string
-             that has been matched with the regular
-             expression using the RxMatch call.
-
- lszDst      LPSTR  Points to the buffer that receives
-             the extracted substring.
-
- nSize       int  Specifies the number of characters
-             (including the last null character) that
-             can be copied to the buffer.
-
-
-
-
-
-
-
-
-
-
- 17
-
-
-
-
-
-
-
-
- a) Return Value
-
- The return value points to the extracted substring (same as
- lszDst).
-
- b) Comments
-
- The extracted substring is always terminated with the null
- character. If the receiving buffer is too short to accommodate
- the entire extracted substring, the function copies nSize-1
- leftmost characters and appends the null character.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 18
-
-
-
-
-
-
-
-
- 3. RxReplace - Replace Placeholders
-
-
-
-
- LPSTR RxReplace(rx, nLim, lszTxt, lszDst, nSize)
-
- This function assumes that the rx, nLim and lszTxt parameters
- have been used as arguments in a successful RxMatch call. The
- function modifies the string defined by lszDst, replacing
- placeholders embedded in it with substrings extracted from the
- string defined by lszTxt.
-
- A placeholder is a single hexadecimal digit (0,1,...,e,f)
- preceded by the percent character (e.g. %2 or %a). The
- placeholder of the form %n is replaced by the substring of
- lszTxt determined as follows:
-
- ■ If n is zero, it is the substring matching the whole
- regular expression in the RxMatch call.
-
- ■ If n is greater than zero, it is the substring matching
- the n-th group of the regular expression. If no group in
- the regular expression corresponds to n, the substring is
- empty.
-
- If the modified string (including the terminating null) is
- longer than the destination length (nSize), it is truncated at
- the end to fit in the destination area.
-
- Parameter   Type/Description
-
- rx          RX FAR []  Specifies an area used as a
-             feedback array in a successful RxMatch
-             call.
-
- nLim        int  Specifies the size of the rx array
-             used in the RxMatch call.
-
- lszTxt      LPSTR  Points to a null-terminated string
-             that has been matched with the regular
-             expression using the RxMatch call.
-
- lszDst      LPSTR  Points to the null-terminated
-             string that is to be modified.
-
- nSize       int  Specifies the number of characters
-             that can be used in the area defined by
-             lszDst.
-
-
-
-
-
-
-
-
- 19
-
-
-
-
-
-
-
-
- a) Return Value
-
- The return value points to the last position (the terminating
- null character) of the modified string (lszDst). The value is
- NULL if during the operation some characters were lost due to
- lack of free space in lszDst.
-
- b) Comments
-
- The RxReplace function uses the area defined by lszDst and
- nSize as the only work area. It replaces the placeholders one
- by one starting from the left. After each replacement, the
- destination string expands, shrinks or keeps its length,
- depending on what replaces the placeholder. The
- destination area should be large enough to accommodate any one
- of those intermediate results.
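-
- The placeholder rules can be illustrated with the following
- sketch. It is not the library's implementation: unlike
- RxReplace, which works in place, the hypothetical function
- expand reads a separate template and writes to a separate
- output buffer, which keeps the example short. Two further
- assumptions are made: a percent character that is not
- followed by a digit or a letter a-f is copied unchanged, and
- groups that did not take part in the match have a size of
- zero in the feedback array.
-
```c
#include <stddef.h>
#include <string.h>

typedef struct { int pos; int size; } RX;

/* Value of a placeholder digit (0-9, a-f), or -1 for anything else. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical sketch: expand the %n placeholders of tmpl into out
   (outSize bytes), taking substrings of txt from rx[0..nLim-1].
   Returns a pointer to the terminating null of out, or NULL if some
   characters were lost for lack of space. */
char *expand(const RX *rx, int nLim, const char *txt,
             const char *tmpl, char *out, int outSize)
{
    int used = 0;
    int lost = 0;

    if (outSize <= 0)
        return NULL;
    while (*tmpl != '\0') {
        const char *src = tmpl;         /* default: one literal character */
        int n = 1;
        int g = (*tmpl == '%') ? hex_digit(tmpl[1]) : -1;

        if (g >= 0) {                   /* a placeholder %g */
            if (g < nLim) {
                src = txt + rx[g].pos;
                n = rx[g].size;
            } else {
                n = 0;                  /* no such group: empty substring */
            }
            tmpl += 2;
        } else {
            tmpl += 1;
        }
        if (n > outSize - 1 - used) {   /* keep room for the null */
            n = outSize - 1 - used;
            lost = 1;
        }
        memcpy(out + used, src, (size_t)n);
        used += n;
    }
    out[used] = '\0';
    return lost ? NULL : out + used;
}
```
-
- With a hand-written feedback array describing the string
- "12, 34" (whole match 0;6, groups 0;2, 2;1 and 4;2), the
- template "Length=%1, Width=%3" expands to
- "Length=12, Width=34".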
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 20
-
-
-
-
-
-
-
-
- 4. RxMsgText - Build Error Message
-
-
-
-
- int RxMsgText(rx, lszMsg, nSize)
-
- This function assumes that the rx parameter has been used as
- an argument in an unsuccessful RxMatch call. If the RxMatch
- call failed because of errors, the function creates an error
- message in the buffer defined by the lszMsg parameter. If the
- RxMatch was unsuccessful because there was no match with the
- regular expression, the function puts an empty string in the
- buffer.
-
- If the message text string (including the terminating null) is
- longer than the buffer length (nSize), it is truncated at the
- end to fit in the buffer area.
-
- Parameter   Type/Description
-
- rx          RX FAR []  Specifies an area used as a
-             feedback array in an unsuccessful RxMatch
-             call.
-
- lszMsg      LPSTR  Points to the buffer that receives
-             the error message text.
-
- nSize       int  Specifies the number of characters
-             (including the last null character) that
-             can be copied to the buffer.
-
- a) Return Value
-
- The return value specifies the length of the error message. It
- is zero if there was no error condition detected.
-
- b) Comments
-
- The function can respond with one of the following messages:
-
- Rex or Txt parameter is NULL
-
- Rex too long
-
- {{{ }}} too deep
-
- Missing right brace
-
- Missing left brace
-
-
-
-
-
-
-
- 21
-
-
-
-
-
-
-
-
- Iteration (+*) on empty string
-
- Nested iteration
-
- Invalid range
-
- Missing right bracket
-
- Incomplete escape sequence
-
- Logic error
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 22
-
-
-
-
-
-
-
-
-
- G. REGISTRATION FORM
-
-
-
- Name _____________________________________
-
- Company _____________________________________
-
- Title _____________________________________
-
- Address _____________________________________
-
- City, State ________________________ Zip ________
-
- Phone _____________________________________
-
-
-
-
-
- Registration fee # computers ____ x $10 ______
-
-
-
- Documentation and the source
- code on a disk - add: # computers ____ x $15 ______
-
-
-
- TOTAL ............................................... ______
-
-
-
-
-
- Diskette format for the source code (choose one)
-
- 5.25" disk _____ 3.5" disk _____
-
-
-
-
-
- Mail this form with your payment to:
-
- Windfall Software Systems
-
- 40 Windfall Lane
-
- Marlboro, NJ 07746
-
-
-
-
-
-
- 23
-
-
-
-
-
-
- ----------------end-of-author's-documentation---------------
-
- Software Library Information:
-
- This disk copy provided as a service of
-
- The Public (Software) Library
-
- We are not the authors of this program, nor are we associated
- with the author in any way other than as a distributor of the
- program in accordance with the author's terms of distribution.
-
- Please direct shareware payments and specific questions about
- this program to the author of the program, whose name appears
- elsewhere in this documentation. If you have trouble getting
- in touch with the author, we will do whatever we can to help
- you with your questions. All programs have been tested and do
- run. To report problems, please use the form that is in the
- file PROBLEM.DOC on many of our disks or in other written
- format with screen printouts, if possible. The P(s)L cannot
- debug programs over the telephone.
-
- Disks in the P(s)L are updated monthly, so if you did not get
- this disk directly from the P(s)L, you should be aware that
- the files in this set may no longer be the current versions.
-
- For a copy of the latest monthly software library newsletter
- and a list of the 2,000+ disks in the library, call or write
-
- The Public (Software) Library
- P.O.Box 35705 - F
- Houston, TX 77235-5705
- (713) 665-7017
-